Part 1:
Part 2:
%load_ext autoreload
%autoreload 2
# uncomment this if running locally or on Google Colab
!pip install --upgrade hepml
Requirement already satisfied: hepml in c:\users\dell\anaconda3\lib\site-packages (0.0.12) Requirement already satisfied: wget in c:\users\dell\anaconda3\lib\site-packages (from hepml) (3.2) Requirement already satisfied: pandas in c:\users\dell\anaconda3\lib\site-packages (from hepml) (1.3.4) Requirement already satisfied: numpy in c:\users\dell\anaconda3\lib\site-packages (from hepml) (1.21.5) Requirement already satisfied: nbdev in c:\users\dell\anaconda3\lib\site-packages (from hepml) (1.2.2) Requirement already satisfied: graphviz in c:\users\dell\anaconda3\lib\site-packages (from hepml) (0.19.1) Requirement already satisfied: pillow in c:\users\dell\anaconda3\lib\site-packages (from hepml) (8.4.0) Requirement already satisfied: scikit-learn in c:\users\dell\anaconda3\lib\site-packages (from hepml) (0.24.2) Requirement already satisfied: seaborn in c:\users\dell\anaconda3\lib\site-packages (from hepml) (0.11.2) Requirement already satisfied: tqdm in c:\users\dell\anaconda3\lib\site-packages (from hepml) (4.62.3) Requirement already satisfied: gdown in c:\users\dell\anaconda3\lib\site-packages (from hepml) (4.4.0) Requirement already satisfied: black in c:\users\dell\anaconda3\lib\site-packages (from hepml) (19.10b0) Requirement already satisfied: giotto-tda in c:\users\dell\anaconda3\lib\site-packages (from hepml) (0.5.1) Requirement already satisfied: pyarrow in c:\users\dell\anaconda3\lib\site-packages (from hepml) (7.0.0) Requirement already satisfied: sklearn-pandas in c:\users\dell\anaconda3\lib\site-packages (from hepml) (2.2.0) Requirement already satisfied: Cython in c:\users\dell\anaconda3\lib\site-packages (from hepml) (0.29.24) Requirement already satisfied: numba in c:\users\dell\anaconda3\lib\site-packages (from hepml) (0.55.1) Requirement already satisfied: fastprogress in c:\users\dell\anaconda3\lib\site-packages (from hepml) (1.0.2) Requirement already satisfied: toml>=0.9.4 in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (0.10.2) Requirement already satisfied: regex in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (2021.8.3) Requirement already satisfied: click>=6.5 in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (8.0.3) Requirement already satisfied: appdirs in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (1.4.4) Requirement already satisfied: pathspec<1,>=0.6 in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (0.7.0) Requirement already satisfied: attrs>=18.1.0 in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (21.2.0) Requirement already satisfied: typed-ast>=1.4.0 in c:\users\dell\anaconda3\lib\site-packages (from black->hepml) (1.4.3) Requirement already satisfied: colorama in c:\users\dell\anaconda3\lib\site-packages (from click>=6.5->black->hepml) (0.4.4) Requirement already satisfied: requests[socks] in c:\users\dell\anaconda3\lib\site-packages (from gdown->hepml) (2.26.0) Requirement already satisfied: beautifulsoup4 in c:\users\dell\anaconda3\lib\site-packages (from gdown->hepml) (4.10.0) Requirement already satisfied: filelock in c:\users\dell\anaconda3\lib\site-packages (from gdown->hepml) (3.3.1) Requirement already satisfied: six in c:\users\dell\anaconda3\lib\site-packages (from gdown->hepml) (1.16.0) Requirement already satisfied: soupsieve>1.2 in c:\users\dell\anaconda3\lib\site-packages (from beautifulsoup4->gdown->hepml) (2.2.1) Requirement already satisfied: python-igraph>=0.8.2 in c:\users\dell\anaconda3\lib\site-packages (from giotto-tda->hepml) (0.9.9) Requirement already satisfied: pyflagser>=0.4.3 in c:\users\dell\anaconda3\lib\site-packages (from giotto-tda->hepml) (0.4.4) Requirement already satisfied: plotly>=4.8.2 in c:\users\dell\anaconda3\lib\site-packages (from giotto-tda->hepml) (5.6.0) Requirement already satisfied: ipywidgets>=7.5.1 in c:\users\dell\anaconda3\lib\site-packages (from giotto-tda->hepml) (7.6.5) Requirement already satisfied: joblib>=0.16.0 in c:\users\dell\anaconda3\lib\site-packages (from giotto-tda->hepml) (1.1.0) Requirement already satisfied: scipy>=1.5.0 in c:\users\dell\anaconda3\lib\site-packages (from giotto-tda->hepml) (1.7.1) Requirement already satisfied: traitlets>=4.3.1 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (5.1.0) Requirement already satisfied: ipython>=4.0.0 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (7.29.0) Requirement already satisfied: ipython-genutils~=0.2.0 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (0.2.0) Requirement already satisfied: jupyterlab-widgets>=1.0.0 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (1.0.0) Requirement already satisfied: ipykernel>=4.5.1 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (6.4.1) Requirement already satisfied: nbformat>=4.2.0 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (5.1.3) Requirement already satisfied: widgetsnbextension~=3.5.0 in c:\users\dell\anaconda3\lib\site-packages (from ipywidgets>=7.5.1->giotto-tda->hepml) (3.5.1) Requirement already satisfied: debugpy<2.0,>=1.0.0 in c:\users\dell\anaconda3\lib\site-packages (from ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (1.4.1) Requirement already satisfied: jupyter-client<8.0 in c:\users\dell\anaconda3\lib\site-packages (from ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (6.1.12) Requirement already satisfied: tornado<7.0,>=4.2 in c:\users\dell\anaconda3\lib\site-packages (from ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (6.1) Requirement already satisfied: matplotlib-inline<0.2.0,>=0.1.0 in c:\users\dell\anaconda3\lib\site-packages (from ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (0.1.2) Requirement already satisfied: decorator in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (5.1.0) Requirement already satisfied: setuptools>=18.5 in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (58.0.4) Requirement already satisfied: jedi>=0.16 in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.18.0) Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (3.0.20) Requirement already satisfied: backcall in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.2.0) Requirement already satisfied: pickleshare in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.7.5) Requirement already satisfied: pygments in c:\users\dell\anaconda3\lib\site-packages (from ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (2.10.0) Requirement already satisfied: parso<0.9.0,>=0.8.0 in c:\users\dell\anaconda3\lib\site-packages (from jedi>=0.16->ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.8.2) Requirement already satisfied: python-dateutil>=2.1 in c:\users\dell\anaconda3\lib\site-packages (from jupyter-client<8.0->ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (2.8.2) Requirement already satisfied: pyzmq>=13 in c:\users\dell\anaconda3\lib\site-packages (from jupyter-client<8.0->ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (22.2.1) Requirement already satisfied: jupyter-core>=4.6.0 in c:\users\dell\anaconda3\lib\site-packages (from jupyter-client<8.0->ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (4.8.1) Requirement already satisfied: pywin32>=1.0 in c:\users\dell\anaconda3\lib\site-packages (from jupyter-core>=4.6.0->jupyter-client<8.0->ipykernel>=4.5.1->ipywidgets>=7.5.1->giotto-tda->hepml) (228) Requirement already satisfied: jsonschema!=2.5.0,>=2.4 in c:\users\dell\anaconda3\lib\site-packages (from nbformat>=4.2.0->ipywidgets>=7.5.1->giotto-tda->hepml) (3.2.0) Requirement already satisfied: pyrsistent>=0.14.0 in c:\users\dell\anaconda3\lib\site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.18.0) Requirement already satisfied: tenacity>=6.2.0 in c:\users\dell\anaconda3\lib\site-packages (from plotly>=4.8.2->giotto-tda->hepml) (8.0.1) Requirement already satisfied: wcwidth in c:\users\dell\anaconda3\lib\site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=4.0.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.2.5) Requirement already satisfied: igraph==0.9.9 in c:\users\dell\anaconda3\lib\site-packages (from python-igraph>=0.8.2->giotto-tda->hepml) (0.9.9) Requirement already satisfied: texttable>=1.6.2 in c:\users\dell\anaconda3\lib\site-packages (from igraph==0.9.9->python-igraph>=0.8.2->giotto-tda->hepml) (1.6.4) Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\dell\anaconda3\lib\site-packages (from scikit-learn->hepml) (2.2.0) Requirement already satisfied: notebook>=4.4.1 in c:\users\dell\anaconda3\lib\site-packages (from widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (6.4.5) Requirement already satisfied: argon2-cffi in c:\users\dell\anaconda3\lib\site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (20.1.0) Requirement already satisfied: Send2Trash>=1.5.0 in c:\users\dell\anaconda3\lib\site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (1.8.0) Requirement already satisfied: jinja2 in c:\users\dell\anaconda3\lib\site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (2.11.3) Requirement already satisfied: nbconvert in c:\users\dell\anaconda3\lib\site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (6.1.0) Requirement already satisfied: prometheus-client in c:\users\dell\anaconda3\lib\site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.11.0) Requirement already satisfied: terminado>=0.8.3 in c:\users\dell\anaconda3\lib\site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.9.4) Requirement already satisfied: pywinpty>=0.5 in c:\users\dell\anaconda3\lib\site-packages (from terminado>=0.8.3->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.5.7) Requirement already satisfied: cffi>=1.0.0 in c:\users\dell\anaconda3\lib\site-packages (from argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (1.14.6) Requirement already satisfied: pycparser in c:\users\dell\anaconda3\lib\site-packages (from cffi>=1.0.0->argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (2.20) Requirement already satisfied: MarkupSafe>=0.23 in c:\users\dell\anaconda3\lib\site-packages (from jinja2->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (1.1.1) Requirement already satisfied: entrypoints>=0.2.2 in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.3) Requirement already satisfied: defusedxml in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.7.1) Requirement already satisfied: bleach in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (4.0.0) Requirement already satisfied: mistune<2,>=0.8.1 in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.8.4) Requirement already satisfied: pandocfilters>=1.4.1 in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (1.4.3) Requirement already satisfied: testpath in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.5.0) Requirement already satisfied: nbclient<0.6.0,>=0.5.0 in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.5.3) Requirement already satisfied: jupyterlab-pygments in c:\users\dell\anaconda3\lib\site-packages (from nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.1.2) Requirement already satisfied: nest-asyncio in c:\users\dell\anaconda3\lib\site-packages (from nbclient<0.6.0,>=0.5.0->nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (1.5.1) Requirement already satisfied: async-generator in c:\users\dell\anaconda3\lib\site-packages (from nbclient<0.6.0,>=0.5.0->nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (1.10) Requirement already satisfied: packaging in c:\users\dell\anaconda3\lib\site-packages (from bleach->nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (21.3) Requirement already satisfied: webencodings in c:\users\dell\anaconda3\lib\site-packages (from bleach->nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (0.5.1) Requirement already satisfied: fastrelease in c:\users\dell\anaconda3\lib\site-packages (from nbdev->hepml) (0.1.12) Requirement already satisfied: pip in c:\users\dell\anaconda3\lib\site-packages (from nbdev->hepml) (21.2.4) Requirement already satisfied: jupyter in c:\users\dell\anaconda3\lib\site-packages (from nbdev->hepml) (1.0.0) Requirement already satisfied: ghapi in c:\users\dell\anaconda3\lib\site-packages (from nbdev->hepml) (0.1.19) Requirement already satisfied: fastcore>=1.3.29 in c:\users\dell\anaconda3\lib\site-packages (from nbdev->hepml) (1.3.29) Requirement already satisfied: pyyaml in c:\users\dell\anaconda3\lib\site-packages (from nbdev->hepml) (6.0) Requirement already satisfied: qtconsole in c:\users\dell\anaconda3\lib\site-packages (from jupyter->nbdev->hepml) (5.1.1) Requirement already satisfied: jupyter-console in c:\users\dell\anaconda3\lib\site-packages (from jupyter->nbdev->hepml) (6.4.0) Requirement already satisfied: llvmlite<0.39,>=0.38.0rc1 in c:\users\dell\anaconda3\lib\site-packages (from numba->hepml) (0.38.0) Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in c:\users\dell\anaconda3\lib\site-packages (from packaging->bleach->nbconvert->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->giotto-tda->hepml) (3.0.4) Requirement already satisfied: pytz>=2017.3 in c:\users\dell\anaconda3\lib\site-packages (from pandas->hepml) (2021.3) Requirement already satisfied: qtpy in c:\users\dell\anaconda3\lib\site-packages (from qtconsole->jupyter->nbdev->hepml) (1.10.0) Requirement already satisfied: charset-normalizer~=2.0.0 in c:\users\dell\anaconda3\lib\site-packages (from requests[socks]->gdown->hepml) (2.0.4) Requirement already satisfied: certifi>=2017.4.17 in c:\users\dell\anaconda3\lib\site-packages (from requests[socks]->gdown->hepml) (2021.10.8) Requirement already satisfied: idna<4,>=2.5 in c:\users\dell\anaconda3\lib\site-packages (from requests[socks]->gdown->hepml) (2.10) Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\dell\anaconda3\lib\site-packages (from requests[socks]->gdown->hepml) (1.26.7) Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in c:\users\dell\anaconda3\lib\site-packages (from requests[socks]->gdown->hepml) (1.7.1) Requirement already satisfied: matplotlib>=2.2 in c:\users\dell\anaconda3\lib\site-packages (from seaborn->hepml) (3.4.3) Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\dell\anaconda3\lib\site-packages (from matplotlib>=2.2->seaborn->hepml) (1.3.1) Requirement already satisfied: cycler>=0.10 in c:\users\dell\anaconda3\lib\site-packages (from matplotlib>=2.2->seaborn->hepml) (0.10.0)
# data wrangling
import numpy as np
import time
import pandas as pd
from pathlib import Path
from IPython.display import YouTubeVideo
from fastprogress import progress_bar
import warnings
warnings.filterwarnings("ignore")
# hepml
from hepml.core import make_gravitational_waves, download_dataset
# tda magic
from gtda.homology import VietorisRipsPersistence, CubicalPersistence
from gtda.diagrams import PersistenceEntropy, Scaler
from gtda.plotting import plot_heatmap, plot_point_cloud, plot_diagram
from gtda.pipeline import Pipeline
from gtda.time_series import SingleTakensEmbedding
# ml tools
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import plot_roc_curve, roc_auc_score, accuracy_score
# dataviz
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(color_codes=True)
sns.set_palette(sns.color_palette("muted"))
The first step in analysing the topology of time series is to construct a so-called sliding window embedding, invented by Takens in the 1980s. As shown in the diagram below, a sliding window embedding of a signal can be thought of as sliding a "window" of fixed size over a signal, with each window represented as a point in a (possibly) higher-dimensional space:

Figure reference: https://bit.ly/3easx09
More formally: given a time series $f(t)$, one can extract a sequence of vectors of the form $f_i = [(f(t_i)), f(t_i + 2 \tau), \ldots, f(t_i + M \tau)] \in \mathbb{R}^{M+1}$, where $M$ is the embedding dimension and $\tau$ is the time delay. The quantity $M\tau$ is known as the "window size" and the difference between $t_{i+1}$ and $t_i$ is called the stride.
The sliding window allows us to apply Takens' embedding locally on a certain interval rather than over the whole time series. The result of this procedure is a time series of point clouds with possibly interesting topologies. These topologies can be used to classify whether a signal is periodic or not.
x_periodic = np.linspace(0, 10, 1000)
y_periodic_cos5x = np.cos(5 * x_periodic)
print(f"Number of time steps: {len(y_periodic_cos5x)}")
Number of time steps: 1000
plt.figure(figsize=(20, 4))
plt.plot(x_periodic, y_periodic_cos5x)
plt.xlabel("Timestep")
plt.ylabel("Amplitude")
plt.show()
embedding_dimension_periodic = 3
embedding_time_delay_periodic = 8
stride = 10
embedder_periodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_periodic,
dimension=embedding_dimension_periodic,
stride=stride,
)
y_periodic_embedded_cos5x = embedder_periodic.fit_transform(y_periodic_cos5x)
y_periodic_embedded_cos5x.shape
(99, 3)
plot_point_cloud(y_periodic_embedded_cos5x)
x_nonperiodic = np.linspace(0, 50, 1000)
y_nonperiodic_cosx_cospix = np.cos(x_nonperiodic) + np.cos(np.pi * x_nonperiodic)
print(f"Number of time steps: {len(y_nonperiodic_cosx_cospix)}")
Number of time steps: 1000
plt.figure(figsize=(8, 4))
plt.plot(x_nonperiodic, y_nonperiodic_cosx_cospix)
plt.xlabel("Amplitude")
plt.ylabel("Timestep")
plt.show()
embedding_dimension_nonperiodic = 3
embedding_time_delay_nonperiodic = 16
stride = 3
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_nonperiodic,
dimension=embedding_dimension_nonperiodic,
stride=stride,
)
y_nonperiodic_embedded_cosx_cospix = embedder_nonperiodic.fit_transform(y_nonperiodic_cosx_cospix)
y_nonperiodic_embedded_cosx_cospix.shape
(323, 3)
plot_point_cloud(y_nonperiodic_embedded_cosx_cospix)
x_periodic = np.linspace(0, 10, 1000)
y_periodic_sinx = np.sin(x_periodic)
print(f"Number of time steps: {len(y_periodic_sinx)}")
Number of time steps: 1000
plt.figure(figsize=(20, 4))
plt.plot(x_periodic, y_periodic_sinx)
plt.xlabel("Timestep")
plt.ylabel("Amplitude")
plt.show()
embedding_dimension_periodic = 3
embedding_time_delay_periodic = 8
stride = 10
embedder_periodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_periodic,
dimension=embedding_dimension_periodic,
stride=stride,
)
y_periodic_embedded_sinx = embedder_periodic.fit_transform(y_periodic_sinx)
y_periodic_embedded_sinx.shape
(99, 3)
plot_point_cloud(y_periodic_embedded_sinx)
x_periodic = np.linspace(0, 10, 1000)
y_periodic_sin5x = np.sin(5*x_periodic)
print(f"Number of time steps: {len(y_periodic_sin5x)}")
plt.figure(figsize=(20, 4))
plt.plot(x_periodic, y_periodic_sin5x)
plt.xlabel("Timestep")
plt.ylabel("Amplitude")
plt.show()
embedding_dimension_periodic = 3
embedding_time_delay_periodic = 8
stride = 10
embedder_periodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_periodic,
dimension=embedding_dimension_periodic,
stride=stride,
)
y_periodic_embedded_sin5x = embedder_periodic.fit_transform(y_periodic_sin5x)
print(y_periodic_embedded_sin5x.shape)
plot_point_cloud(y_periodic_embedded_sin5x)
Number of time steps: 1000
(99, 3)
x_periodic = np.linspace(0, 10, 1000)
y_periodic_sinx_plus_cosx = np.sin(x_periodic)+np.cos(x_periodic)
print(f"Number of time steps: {len(y_periodic_sinx_plus_cosx)}")
plt.figure(figsize=(20, 4))
plt.plot(x_periodic, y_periodic_sinx_plus_cosx)
plt.xlabel("Timestep")
plt.ylabel("Amplitude")
plt.show()
embedding_dimension_periodic = 3
embedding_time_delay_periodic = 8
stride = 10
embedder_periodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_periodic,
dimension=embedding_dimension_periodic,
stride=stride,
)
y_periodic_embedded_sinx_plus_cosx = embedder_periodic.fit_transform(y_periodic_sinx_plus_cosx)
print(y_periodic_embedded_sinx_plus_cosx.shape)
plot_point_cloud(y_periodic_embedded_sinx_plus_cosx)
Number of time steps: 1000
(99, 3)
x_nonperiodic = np.linspace(0, 50, 1000)
y_nonperiodic_xcosx = x_nonperiodic*np.cos(x_nonperiodic)
print(f"Number of time steps: {len(y_nonperiodic_xcosx)}")
plt.figure(figsize=(8, 4))
plt.plot(x_nonperiodic, y_nonperiodic_xcosx)
plt.xlabel("Amplitude")
plt.ylabel("Timestep")
plt.show()
embedding_dimension_nonperiodic = 3
embedding_time_delay_nonperiodic = 16
stride = 3
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_nonperiodic,
dimension=embedding_dimension_nonperiodic,
stride=stride,
)
y_nonperiodic_embedded_xcosx = embedder_nonperiodic.fit_transform(y_nonperiodic_xcosx)
print(y_nonperiodic_embedded_xcosx.shape)
plot_point_cloud(y_nonperiodic_embedded_xcosx)
Number of time steps: 1000
(323, 3)
x_nonperiodic = np.linspace(0, 50, 1000)
y_nonperiodic_esin = np.exp(0.2*x_nonperiodic) * np.sin(10*x_periodic)
print(f"Number of time steps: {len(y_nonperiodic_esin)}")
plt.figure(figsize=(8, 4))
plt.plot(x_nonperiodic, y_nonperiodic_esin)
plt.xlabel("Amplitude")
plt.ylabel("Timestep")
plt.show()
embedding_dimension_nonperiodic = 3
embedding_time_delay_nonperiodic = 16
stride = 3
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_nonperiodic,
dimension=embedding_dimension_nonperiodic,
stride=stride,
)
y_nonperiodic_embedded_esin = embedder_nonperiodic.fit_transform(y_nonperiodic_esin)
print(y_nonperiodic_embedded_esin.shape)
plot_point_cloud(y_nonperiodic_embedded_esin)
Number of time steps: 1000
(323, 3)
Fs = 60
time_step = 1/Fs
x_nonperiodic_chirp = np.arange(0,50,time_step)
y_nonperiodic_chirp = np.sin(0.5*np.pi*x_nonperiodic_chirp*(1+0.1*x_nonperiodic_chirp))
print(f"Number of time steps: {len(y_nonperiodic_chirp)}")
plt.figure(figsize=(8, 4))
plt.plot(x_nonperiodic_chirp, y_nonperiodic_chirp)
plt.xlabel("Amplitude")
plt.ylabel("Timestep")
plt.show()
embedding_dimension_nonperiodic = 3
embedding_time_delay_nonperiodic = 16
stride = 3
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="fixed",
n_jobs=2,
time_delay=embedding_time_delay_nonperiodic,
dimension=embedding_dimension_nonperiodic,
stride=stride,
)
y_nonperiodic_embedded_chirp = embedder_nonperiodic.fit_transform(y_nonperiodic_chirp)
print(y_nonperiodic_embedded_chirp.shape)
plot_point_cloud(y_nonperiodic_embedded_chirp)
Number of time steps: 3000
(990, 3)
(n_samples, n_points, n_dimensions):¶y_periodic_embedded_cos5x = y_periodic_embedded_cos5x[None, :, :]
y_nonperiodic_embedded_cosx_cospix = y_nonperiodic_embedded_cosx_cospix[None, :, :]
# check shapes
y_periodic_embedded_cos5x.shape, y_nonperiodic_embedded_cosx_cospix.shape
((1, 99, 3), (1, 323, 3))
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_cos5x)
Wall time: 2.09 s
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_cosx_cospix)
Wall time: 2.19 s
plot_diagram(periodic_persistence_diagrams[0])
plot_diagram(nonperiodic_persistence_diagrams[0])
y_periodic_embedded_sinx = y_periodic_embedded_sinx[None, :, :]
# check shapes
print(y_periodic_embedded_sinx.shape)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_sinx)
plot_diagram(periodic_persistence_diagrams[0])
(1, 99, 3) Wall time: 1.49 s
y_periodic_embedded_sin5x = y_periodic_embedded_sin5x[None, :, :]
# check shapes
print(y_periodic_embedded_sin5x.shape)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_sin5x)
plot_diagram(periodic_persistence_diagrams[0])
(1, 99, 3) Wall time: 1.42 s
y_periodic_embedded_sinx_plus_cosx = y_periodic_embedded_sinx_plus_cosx[None, :, :]
# check shapes
print(y_periodic_embedded_sinx_plus_cosx.shape)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_sinx_plus_cosx)
plot_diagram(periodic_persistence_diagrams[0])
(1, 99, 3) Wall time: 1.36 s
y_nonperiodic_embedded_xcosx = y_nonperiodic_embedded_xcosx[None, :, :]
# check shapes
print(y_nonperiodic_embedded_xcosx.shape)
homology_dimensions = [0, 1, 2]
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_xcosx)
plot_diagram(nonperiodic_persistence_diagrams[0])
(1, 323, 3) Wall time: 5.93 s
y_nonperiodic_embedded_esin = y_nonperiodic_embedded_esin[None, :, :]
# check shapes
print(y_nonperiodic_embedded_esin.shape)
homology_dimensions = [0, 1, 2]
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_esin)
plot_diagram(nonperiodic_persistence_diagrams[0])
(1, 323, 3) Wall time: 9.28 s
y_nonperiodic_embedded_chirp = y_nonperiodic_embedded_chirp[None, :, :]
# check shapes
print(y_nonperiodic_embedded_chirp.shape)
homology_dimensions = [0, 1, 2]
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_chirp)
plot_diagram(nonperiodic_persistence_diagrams[0])
(1, 990, 3) Wall time: 31.5 s
In the examples above, we have manually chosen values for the embedding dimension $M$ and time delay $\tau$. However, it turns out there are two techniques that can be used to determine these parameters automatically:
In giotto-tda, these techniques are applied when we select parameters_type="search" in the SingleTakensEmbedding transformer, e.g.
embedder = SingleTakensEmbedding(
parameters_type="search", time_delay=time_delay, dimension=embedding_dimension,
)
where the values of time_delay and embedding_dimension provide upper bounds on the search algorithm. Before applying this to our sample signals, let's have a look at how these methods actually work under the hood.
To determine an optimal value for $\tau$ we first calculate the maximum $x_\mathrm{max}$ and minimum $x_\mathrm{min}$ values of the time series, and divide the interval $[x_\mathrm{min}, x_\mathrm{max}]$ into a large number of bins. We let $p_k$ be the probability that an element of the time series is in the $k$th bin and let $p_{j,k}$ be the probability that $x_i$ is in the $j$th bin while $x_{i+\tau}$ is in the $k$th bin. Then the mutual information is defined as:
$$ I(\tau) = - \sum_{j=1}^{n_\mathrm{bins}} \sum_{k=1}^{n_\mathrm{bins}} p_{j,k}(\tau) \log \frac{p_{j,k}(\tau)}{p_j p_k} $$The first minimum of $I(\tau)$ gives the optimal delay since there we get the most information by adding $x_{i+\tau}$.
The false nearest neighbours algorithm is based on the assumption that "unfolding" or embedding a deterministic system into successively higher dimensions is smooth. In other words, points which are close in one embedding dimension should be close in a higher one. More formally, if we have a point $p_i$ and neighbour $p_j$, we check if the normalised distance $R_i$ for the next dimension is greater than some threshold $R_\mathrm{th}$:
$$ R_i = \frac{\mid x_{i+m\tau} - x_{j+m\tau} \mid}{\lVert p_i - p_j \rVert} > R_\mathrm{th}$$If $R_i > R_\mathrm{th}$ then we have a "false nearest neighbour" and the optimal embedding dimension is obtained by minimising the total number of such neighbours.
Let's now apply these ideas to our original signals to see what the algorithm determines as optimal choices for $M$ and $\tau$. We will allow the search to scan up to relatively large values of $(M, \tau)$ to ensure we do not get stuck in a sub-optimal minimum.
max_embedding_dimension = 30
max_time_delay = 30
stride = 5
embedder_periodic = SingleTakensEmbedding(
parameters_type="search", time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride
)
def fit_embedder(embedder, y, verbose=True):
y_embedded = embedder.fit_transform(y)
if verbose:
print(f"Shape of embedded time series: {y_embedded.shape}")
print(f"Optimal embedding dimension is {embedder.dimension_} and time delay is {embedder.time_delay_}")
return y_embedded
y_periodic_embedded_cos5x = fit_embedder(embedder_periodic, y_periodic_cos5x)
Shape of embedded time series: (171, 6) Optimal embedding dimension is 6 and time delay is 29
pca = PCA(n_components=3)
y_periodic_embedded_pca = pca.fit_transform(y_periodic_embedded_cos5x)
plot_point_cloud(y_periodic_embedded_pca)
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="search", n_jobs=2, time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride,
)
y_nonperiodic_embedded_cosx_cospix = fit_embedder(embedder_nonperiodic, y_nonperiodic_cosx_cospix)
Shape of embedded time series: (186, 6) Optimal embedding dimension is 6 and time delay is 14
pca = PCA(n_components=3)
y_nonperiodic_embedded_pca = pca.fit_transform(y_nonperiodic_embedded_cosx_cospix)
plot_point_cloud(y_nonperiodic_embedded_pca)
y_periodic_embedded_cos5x = y_periodic_embedded_cos5x[None, :, :]
y_nonperiodic_embedded_cosx_cospix = y_nonperiodic_embedded_cosx_cospix[None, :, :]
# check shapes
print(y_periodic_embedded_cos5x.shape, y_nonperiodic_embedded_cosx_cospix.shape)
(1, 171, 6) (1, 186, 6)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_cos5x)
Wall time: 2.42 s
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_cosx_cospix)
Wall time: 1.72 s
plot_diagram(periodic_persistence_diagrams[0])
plot_diagram(nonperiodic_persistence_diagrams[0])
max_embedding_dimension = 30
max_time_delay = 30
stride = 5
embedder_periodic = SingleTakensEmbedding(
parameters_type="search", time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride
)
y_periodic_embedded_sinx = fit_embedder(embedder_periodic, y_periodic_sinx)
pca = PCA(n_components=3)
y_periodic_embedded_pca = pca.fit_transform(y_periodic_embedded_sinx)
plot_point_cloud(y_periodic_embedded_pca)
Shape of embedded time series: (168, 7) Optimal embedding dimension is 7 and time delay is 27
y_periodic_embedded_sinx = y_periodic_embedded_sinx[None, :, :]
# check shapes
print(y_periodic_embedded_sinx.shape)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_sinx)
plot_diagram(periodic_persistence_diagrams[0])
(1, 168, 7) Wall time: 1.55 s
max_embedding_dimension = 30
max_time_delay = 30
stride = 5
embedder_periodic = SingleTakensEmbedding(
parameters_type="search", time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride
)
y_periodic_embedded_sin5x = fit_embedder(embedder_periodic, y_periodic_sin5x)
pca = PCA(n_components=3)
y_periodic_embedded_pca = pca.fit_transform(y_periodic_embedded_sin5x)
plot_point_cloud(y_periodic_embedded_pca)
Shape of embedded time series: (169, 7) Optimal embedding dimension is 7 and time delay is 26
y_periodic_embedded_sin5x = y_periodic_embedded_sin5x[None, :, :]
# check shapes
print(y_periodic_embedded_sin5x.shape)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_sin5x)
plot_diagram(periodic_persistence_diagrams[0])
(1, 169, 7) Wall time: 1.71 s
max_embedding_dimension = 30
max_time_delay = 30
stride = 5
embedder_periodic = SingleTakensEmbedding(
parameters_type="search", time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride
)
y_periodic_embedded_sinx_plus_cosx = fit_embedder(embedder_periodic, y_periodic_sinx_plus_cosx)
pca = PCA(n_components=3)
y_periodic_embedded_pca = pca.fit_transform(y_periodic_embedded_sinx_plus_cosx)
plot_point_cloud(y_periodic_embedded_pca)
Shape of embedded time series: (168, 7) Optimal embedding dimension is 7 and time delay is 27
y_periodic_embedded_sinx_plus_cosx = y_periodic_embedded_sinx_plus_cosx[None, :, :]
# check shapes
print(y_periodic_embedded_sinx_plus_cosx.shape)
homology_dimensions = [0, 1, 2]
periodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time periodic_persistence_diagrams = periodic_persistence.fit_transform(y_periodic_embedded_sinx_plus_cosx)
plot_diagram(periodic_persistence_diagrams[0])
(1, 168, 7) Wall time: 1.59 s
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="search", n_jobs=2, time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride,
)
y_nonperiodic_embedded_xcosx = fit_embedder(embedder_nonperiodic, y_nonperiodic_xcosx)
pca = PCA(n_components=3)
y_nonperiodic_embedded_pca = pca.fit_transform(y_nonperiodic_embedded_xcosx)
plot_point_cloud(y_nonperiodic_embedded_pca)
Shape of embedded time series: (170, 8) Optimal embedding dimension is 8 and time delay is 22
y_nonperiodic_embedded_xcosx = y_nonperiodic_embedded_xcosx[None, :, :]
print(y_nonperiodic_embedded_xcosx.shape)
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_xcosx)
plot_diagram(nonperiodic_persistence_diagrams[0])
(1, 170, 8) Wall time: 2.37 s
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="search", n_jobs=2, time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride,
)
y_nonperiodic_embedded_esin = fit_embedder(embedder_nonperiodic, y_nonperiodic_esin)
pca = PCA(n_components=3)
y_nonperiodic_embedded_pca = pca.fit_transform(y_nonperiodic_embedded_esin)
plot_point_cloud(y_nonperiodic_embedded_pca)
Shape of embedded time series: (180, 9) Optimal embedding dimension is 9 and time delay is 13
y_nonperiodic_embedded_esin = y_nonperiodic_embedded_esin[None, :, :]
print(y_nonperiodic_embedded_esin.shape)
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_esin)
plot_diagram(nonperiodic_persistence_diagrams[0])
(1, 180, 9) Wall time: 3.03 s
embedder_nonperiodic = SingleTakensEmbedding(
parameters_type="search", n_jobs=2, time_delay=max_time_delay, dimension=max_embedding_dimension, stride=stride,
)
y_nonperiodic_embedded_chirp = fit_embedder(embedder_nonperiodic, y_nonperiodic_chirp)
pca = PCA(n_components=3)
y_nonperiodic_embedded_pca = pca.fit_transform(y_nonperiodic_embedded_chirp)
plot_point_cloud(y_nonperiodic_embedded_pca)
Shape of embedded time series: (580, 11) Optimal embedding dimension is 11 and time delay is 10
y_nonperiodic_embedded_chirp = y_nonperiodic_embedded_chirp[None, :, :]
print(y_nonperiodic_embedded_chirp.shape)
nonperiodic_persistence = VietorisRipsPersistence(homology_dimensions=homology_dimensions, n_jobs=6)
%time nonperiodic_persistence_diagrams = nonperiodic_persistence.fit_transform(y_nonperiodic_embedded_chirp)
plot_diagram(nonperiodic_persistence_diagrams[0])
(1, 580, 11) Wall time: 49.2 s